Overview

Dataset statistics

Number of variables 15
Number of observations 144458
Missing cells 1433421
Missing cells (%) 66.2%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 16.5 MiB
Average record size in memory 120.0 B
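
The overview figures can be cross-checked against each other. The sketch below recomputes the missing-cell total, the missing-cell percentage and the approximate memory footprint from counts reported elsewhere in this profile. The variable names are illustrative, and the per-variable missing counts are copied from the Alerts section.

```python
# Recompute the overview figures from counts reported in this profile.
n_variables = 15
n_observations = 144458

# Missing values per variable, as listed in the Alerts section
# (timestamp has no missing values).
missing_per_variable = {
    "timestamp": 0, "university": 96200, "experimentid": 96200,
    "userid": 96200, "day": 96200, "label": 96200, "accuracy": 96200,
    "still": 96904, "on_foot": 105707, "walking": 105989,
    "running": 113078, "in_vehicle": 104169, "on_bicycle": 107690,
    "tilting": 120881, "unknown": 101803,
}

total_missing = sum(missing_per_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * total_missing / total_cells

# 120 B is the reported average record size; 2**20 bytes make one MiB.
size_mib = 120.0 * n_observations / 2**20

print(total_missing)          # 1433421
print(f"{missing_pct:.1f}%")  # 66.2%
print(f"{size_mib:.1f} MiB")  # 16.5 MiB
```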

Variable types

DateTime 1
Categorical 4
Numeric 10

Dataset

Description Sensor that returns a label identifying the activity performed by the user, accurately detected using low-power signals from multiple sensors in the device. Detection relies on Google’s Activity Recognition APIs. Possible activities are: still, in_vehicle, on_bicycle, on_foot, running, tilting, walking. To make the observations of each sensor comparable, the sampling frequency was reduced to one minute. For each categorical variable, the first non-missing value is reported.
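
The description does not say how the one-minute reduction was implemented. As a rough sketch, assuming raw observations arrive as (timestamp, label) pairs sorted by time and that a missing label is represented by None, the "first non-missing value per minute" rule for a categorical variable could look like this. The function name and the sample rows are hypothetical.

```python
from datetime import datetime

def first_non_missing_per_minute(rows):
    """Reduce (timestamp, label) pairs to one label per minute.

    `rows` is assumed to be sorted by timestamp; a missing label is None.
    """
    per_minute = {}
    for ts, label in rows:
        minute = ts.replace(second=0, microsecond=0)  # truncate to the minute
        # Keep the first non-missing label seen within each minute.
        if per_minute.get(minute) is None:
            per_minute[minute] = label
    return per_minute

# Hypothetical raw observations (the year is a placeholder).
rows = [
    (datetime(1900, 3, 15, 17, 34, 5), None),
    (datetime(1900, 3, 15, 17, 34, 20), "Still"),
    (datetime(1900, 3, 15, 17, 34, 50), "Tilting"),
    (datetime(1900, 3, 15, 17, 35, 10), None),
]
reduced = first_non_missing_per_minute(rows)
for minute, label in reduced.items():
    print(minute, label)
```

A minute that contains only missing labels stays missing, which would be one way for the reduced data to end up with the large share of missing values reported here.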
Creator Matteo Busso, Massimo Stefan
Author Fausto Giunchiglia, Ivano Bison, Matteo Busso, Ronald Chenu-Abente, Marcelo Rodas Britez, Can Gunel, Giuseppe Veltri, Amalia de Götzen, Peter Kun, Amarsanaa Ganbold, Altangerel Chagnaa, George Gaskell, Miriam Bidoglia, Luca Cernuzzi, Alethia Hume, Jose Luis Zarza, Daniele Miorandi, Carlo Caprini
URL
Copyright (c) KnowDive 2022

Variable descriptions

university University where the experiment took place
experimentid Experiment Id
userid User id
day Day of the observation, encoded as month (2 digits) followed by day of month (2 digits)
label The activity name with highest accuracy
timestamp Timestamp encoded as month (2 digits), day (2), hour (2), minute (2), second (2) and decimals (3)
accuracy The highest accuracy for possible activities
still The value of the "still" activity
on_foot The value of the "on_foot" activity
walking The value of the "walking" activity
running The value of the "running" activity
in_vehicle The value of the "in_vehicle" activity
on_bicycle The value of the "on_bicycle" activity
tilting The value of the "tilting" activity
unknown The value of the "unknown" activity
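
The relationship between the per-activity values and the label and accuracy variables can be illustrated with a small sketch. It assumes that, for a single raw observation, label is the activity with the largest value and accuracy is that value, as the descriptions above state. The function name is hypothetical, and the mapping from column names to the label spellings used in the data (for example "Still" or "OnFoot") is not shown.

```python
def top_activity(confidences):
    """Return (activity, value) for the highest non-missing value.

    `confidences` maps activity names to values in 0..100, or None
    when the activity was not reported for this observation.
    """
    present = {name: v for name, v in confidences.items() if v is not None}
    if not present:
        return None, None  # nothing reported: label and accuracy stay missing
    label = max(present, key=present.get)
    return label, present[label]

# Hypothetical observation.
observation = {"still": 97, "on_foot": 1, "unknown": 2, "tilting": None}
label, accuracy = top_activity(observation)
print(label, accuracy)  # still 97
```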

Alerts

tilting has constant value "100.0" Constant
accuracy is highly correlated with still High correlation
still is highly correlated with accuracy High correlation
on_foot is highly correlated with walking High correlation
walking is highly correlated with on_foot High correlation
experimentid is highly correlated with university and 1 other fields High correlation
university is highly correlated with experimentid and 1 other fields High correlation
label is highly correlated with tilting High correlation
tilting is highly correlated with experimentid and 2 other fields High correlation
university is highly correlated with experimentid and 2 other fields High correlation
experimentid is highly correlated with university and 1 other fields High correlation
userid is highly correlated with university High correlation
day is highly correlated with university and 1 other fields High correlation
label is highly correlated with accuracy and 6 other fields High correlation
accuracy is highly correlated with label and 5 other fields High correlation
still is highly correlated with label and 6 other fields High correlation
on_foot is highly correlated with label and 5 other fields High correlation
walking is highly correlated with label and 6 other fields High correlation
running is highly correlated with on_foot and 1 other fields High correlation
in_vehicle is highly correlated with label and 2 other fields High correlation
on_bicycle is highly correlated with label and 4 other fields High correlation
unknown is highly correlated with label and 4 other fields High correlation
university has 96200 (66.6%) missing values Missing
experimentid has 96200 (66.6%) missing values Missing
userid has 96200 (66.6%) missing values Missing
day has 96200 (66.6%) missing values Missing
label has 96200 (66.6%) missing values Missing
accuracy has 96200 (66.6%) missing values Missing
still has 96904 (67.1%) missing values Missing
on_foot has 105707 (73.2%) missing values Missing
walking has 105989 (73.4%) missing values Missing
running has 113078 (78.3%) missing values Missing
in_vehicle has 104169 (72.1%) missing values Missing
on_bicycle has 107690 (74.5%) missing values Missing
tilting has 120881 (83.7%) missing values Missing
unknown has 101803 (70.5%) missing values Missing
timestamp has unique values Unique
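
The "Missing" and "Constant" alerts above follow from simple per-column checks. The sketch below shows one way such checks could be written; it is not the profiling tool's own implementation, and the function name and message formats are illustrative only.

```python
def column_alerts(values):
    """Return simple alert strings for one column (None marks a missing value)."""
    n = len(values)
    present = [v for v in values if v is not None]
    alerts = []
    missing = n - len(present)
    if missing:
        alerts.append(f"Missing {missing} ({100 * missing / n:.1f}%)")
    # A column is constant when all non-missing values are identical.
    if present and len(set(present)) == 1:
        alerts.append(f'Constant "{present[0]}"')
    return alerts

# Hypothetical column shaped like `tilting`: one repeated value, mostly missing.
print(column_alerts([100.0, None, None, None, None, 100.0]))

# The reported percentages follow from the counts in the same way.
print(f"{100 * 96200 / 144458:.1f}%")   # 66.6%
print(f"{100 * 120881 / 144458:.1f}%")  # 83.7%
```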

Reproduction

Analysis started 2022-07-04 17:13:19.565190
Analysis finished 2022-07-04 17:13:50.586082
Duration 31.02 seconds
Software version pandas-profiling v3.2.0
Download configuration config.json
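
The reported duration is the difference between the two analysis timestamps. A quick check, using only the values printed above:

```python
from datetime import datetime

started = datetime.fromisoformat("2022-07-04 17:13:19.565190")
finished = datetime.fromisoformat("2022-07-04 17:13:50.586082")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 31.02 seconds
```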

Variables

timestamp
Date

UNIQUE

Timestamp encoded as month (2 digits), day (2), hour (2), minute (2), second (2) and decimals (3)

Distinct 144458
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 1.1 MiB
Minimum 1900-03-15 17:34:00
Maximum 1900-06-24 01:11:00
Histogram with fixed size bins (bins=50)
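
The timestamp encoding carries no year, which would explain why the minimum and maximum above fall in 1900: that is a common placeholder year (Python's datetime.strptime, for example, defaults to 1900 when the format has no year directive). The sketch below decodes such a value by slicing. The exact raw layout, a 13-digit string "MMDDhhmmssfff" with no separators, is an assumption, and the function name is hypothetical.

```python
from datetime import datetime

def decode_timestamp(raw, year=1900):
    """Decode an assumed 'MMDDhhmmssfff' string; `year` is a placeholder."""
    month, day = int(raw[0:2]), int(raw[2:4])
    hour, minute, second = int(raw[4:6]), int(raw[6:8]), int(raw[8:10])
    millis = int(raw[10:13])
    return datetime(year, month, day, hour, minute, second, millis * 1000)

print(decode_timestamp("0315173400000"))  # 1900-03-15 17:34:00
print(decode_timestamp("0624011100000"))  # 1900-06-24 01:11:00
```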

university
Categorical

HIGH CORRELATION
MISSING

University where the experiment took place

Distinct 5
Distinct (%) < 0.1%
Missing 96200
Missing (%) 66.6%
Memory size 1.1 MiB
unitn 21888
num 16982
lse 4848
uc 3495
aau 1045

Length

Max length 5
Median length 3
Mean length 3.834700982
Min length 2
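
The length statistics follow directly from the category counts reported for this variable. A short check, with illustrative variable names:

```python
# Category counts for `university`, as reported in this section.
counts = {"unitn": 21888, "num": 16982, "lse": 4848, "uc": 3495, "aau": 1045}

total_chars = sum(len(name) * count for name, count in counts.items())
n_values = sum(counts.values())
mean_length = total_chars / n_values

print(total_chars)                             # 185055
print(n_values)                                # 48258
print(min(map(len, counts)), max(map(len, counts)))  # 2 5
print(round(mean_length, 6))
```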

Characters and Unicode

Total characters 185055
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row aau
2nd row aau
3rd row aau
4th row aau
5th row aau

Common Values

Value Count Frequency (%)
unitn 21888 15.2%
num 16982 11.8%
lse 4848 3.4%
uc 3495 2.4%
aau 1045 0.7%
(Missing) 96200 66.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
unitn 21888 45.4%
num 16982 35.2%
lse 4848 10.0%
uc 3495 7.2%
aau 1045 2.2%
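
The Common Values table and the Category Frequency Plot table report different percentages for the same counts because they use different denominators: the former divides by all rows, the latter by non-missing rows only. A brief check, with an illustrative helper name:

```python
n_rows = 144458
n_missing = 96200
n_present = n_rows - n_missing  # 48258 non-missing values

counts = {"unitn": 21888, "num": 16982, "lse": 4848, "uc": 3495, "aau": 1045}

def shares(count):
    """Return (share of all rows, share of non-missing rows) in percent."""
    return round(100 * count / n_rows, 1), round(100 * count / n_present, 1)

for name, count in counts.items():
    print(name, *shares(count))
# unitn 15.2 45.4
# num 11.8 35.2
# lse 3.4 10.0
# uc 2.4 7.2
# aau 0.7 2.2
```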

Most occurring characters

Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 185055 100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

Most occurring scripts

Value Count Frequency (%)
Latin 185055 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

Most occurring blocks

Value Count Frequency (%)
ASCII 185055 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
n 60758 32.8%
u 43410 23.5%
i 21888 11.8%
t 21888 11.8%
m 16982 9.2%
l 4848 2.6%
s 4848 2.6%
e 4848 2.6%
c 3495 1.9%
a 2090 1.1%

experimentid
Categorical

HIGH CORRELATION
MISSING

Experiment Id

Distinct 2
Distinct (%) < 0.1%
Missing 96200
Missing (%) 66.6%
Memory size 1.1 MiB
wenet 26370
wenetUnitn 21888

Length

Max length 10
Median length 5
Mean length 7.267810518
Min length 5

Characters and Unicode

Total characters 350730
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row wenet
2nd row wenet
3rd row wenet
4th row wenet
5th row wenet

Common Values

Value Count Frequency (%)
wenet 26370 18.3%
wenetUnitn 21888 15.2%
(Missing) 96200 66.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
wenet 26370 54.6%
wenetunitn 21888 45.4%

Most occurring characters

Value Count Frequency (%)
e 96516 27.5%
n 92034 26.2%
t 70146 20.0%
w 48258 13.8%
U 21888 6.2%
i 21888 6.2%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 328842 93.8%
Uppercase Letter 21888 6.2%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 96516 29.4%
n 92034 28.0%
t 70146 21.3%
w 48258 14.7%
i 21888 6.7%
Uppercase Letter
Value Count Frequency (%)
U 21888 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 350730 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 96516 27.5%
n 92034 26.2%
t 70146 20.0%
w 48258 13.8%
U 21888 6.2%
i 21888 6.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 350730 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 96516 27.5%
n 92034 26.2%
t 70146 20.0%
w 48258 13.8%
U 21888 6.2%
i 21888 6.2%

userid
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

User id

Distinct 60
Distinct (%) 0.1%
Missing 96200
Missing (%) 66.6%
Infinite 0
Infinite (%) 0.0%
Mean 30.01889842
Minimum 1
Maximum 132
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 1
5-th percentile 4
Q1 11
median 20
Q3 47
95-th percentile 70
Maximum 132
Range 131
Interquartile range (IQR) 36

Descriptive statistics

Standard deviation 23.3663621
Coefficient of variation (CV) 0.778388393
Kurtosis 1.120736143
Mean 30.01889842
Median Absolute Deviation (MAD) 11
Skewness 1.077134101
Sum 1448652
Variance 545.9868779
Monotonicity Not monotonic
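
Several of the statistics above are derived from one another, which allows a quick consistency check. The sketch below recomputes them from the reported values; the variable names are illustrative. Because userid is an identifier, these figures describe the id numbering rather than a measured quantity.

```python
# Values reported for `userid` in this profile.
n_present = 144458 - 96200   # observations minus missing values
total = 1448652              # Sum
std = 23.3663621             # Standard deviation
q1, q3 = 11, 47
minimum, maximum = 1, 132

mean = total / n_present          # about 30.0189
cv = std / mean                   # coefficient of variation, about 0.7784
variance = std ** 2               # about 545.99
iqr = q3 - q1                     # 36
value_range = maximum - minimum   # 131

print(round(mean, 4), round(cv, 4), round(variance, 2), iqr, value_range)
```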
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
20 3333 2.3%
9 2862 2.0%
10 2747 1.9%
17 2392 1.7%
12 2386 1.7%
63 2230 1.5%
27 2058 1.4%
74 1734 1.2%
18 1679 1.2%
1 1538 1.1%
Other values (50) 25299 17.5%
(Missing) 96200 66.6%

Minimum 10 values

Value Count Frequency (%)
1 1538 1.1%
2 416 0.3%
3 57 < 0.1%
4 1079 0.7%
5 899 0.6%
6 194 0.1%
7 617 0.4%
8 902 0.6%
9 2862 2.0%
10 2747 1.9%

Maximum 10 values

Value Count Frequency (%)
132 275 0.2%
124 85 0.1%
75 162 0.1%
74 1734 1.2%
73 98 0.1%
70 962 0.7%
69 131 0.1%
68 1135 0.8%
67 693 0.5%
65 337 0.2%

day
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Day of the observation, encoded as month (2 digits) followed by day of month (2 digits)
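
Because the profile treats this field as a number, the leading zero of the month is dropped (for example, 315 stands for March 15). A minimal decoding sketch, with a hypothetical function name:

```python
def decode_day(value):
    """Split a numeric MMDD value into (month, day of month)."""
    month, day = divmod(int(value), 100)
    return month, day

print(decode_day(315))  # (3, 15)
print(decode_day(404))  # (4, 4)
print(decode_day(624))  # (6, 24)
```

Numeric statistics such as the mean or the range are of limited use for this encoding, since consecutive calendar days are not always consecutive numbers: March 31 is 331, while April 1 is 401.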

Distinct 44
Distinct (%) 0.1%
Missing 96200
Missing (%) 66.6%
Infinite 0
Infinite (%) 0.0%
Mean 464.1346305
Minimum 315
Maximum 624
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 315
5-th percentile 318
Q1 325
median 404
Q3 611
95-th percentile 619
Maximum 624
Range 309
Interquartile range (IQR) 286

Descriptive statistics

Standard deviation 137.1398884
Coefficient of variation (CV) 0.2954743718
Kurtosis -1.904516334
Mean 464.1346305
Median Absolute Deviation (MAD) 86
Skewness 0.1043596868
Sum 22398209
Variance 18807.34899
Monotonicity Increasing
Histogram with fixed size bins (bins=44)

Common Values

Value Count Frequency (%)
328 1440 1.0%
324 1415 1.0%
325 1410 1.0%
612 1394 1.0%
327 1392 1.0%
326 1384 1.0%
318 1373 1.0%
329 1362 0.9%
319 1362 0.9%
322 1336 0.9%
Other values (34) 34390 23.8%
(Missing) 96200 66.6%

Minimum 10 values

Value Count Frequency (%)
315 82 0.1%
316 472 0.3%
317 1136 0.8%
318 1373 1.0%
319 1362 0.9%
320 1260 0.9%
321 1323 0.9%
322 1336 0.9%
323 1331 0.9%
324 1415 1.0%

Maximum 10 values

Value Count Frequency (%)
624 1 < 0.1%
622 12 < 0.1%
621 848 0.6%
620 1218 0.8%
619 1257 0.9%
618 1081 0.7%
617 1123 0.8%
616 1179 0.8%
615 1197 0.8%
614 1306 0.9%

label
Categorical

HIGH CORRELATION
MISSING

The activity name with highest accuracy

Distinct 6
Distinct (%) < 0.1%
Missing 96200
Missing (%) 66.6%
Memory size 1.1 MiB
Still 28796
Unknown 8249
Tilting 5727
OnFoot 2671
InVehicle 2608

Length

Max length 9
Median length 5
Mean length 5.867897551
Min length 5

Characters and Unicode

Total characters 283173
Distinct characters 20
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Still
2nd row Still
3rd row Still
4th row Still
5th row Still

Common Values

Value Count Frequency (%)
Still 28796 19.9%
Unknown 8249 5.7%
Tilting 5727 4.0%
OnFoot 2671 1.8%
InVehicle 2608 1.8%
OnBycicle 207 0.1%
(Missing) 96200 66.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
still 28796 59.7%
unknown 8249 17.1%
tilting 5727 11.9%
onfoot 2671 5.5%
invehicle 2608 5.4%
onbycicle 207 0.4%

Most occurring characters

Value Count Frequency (%)
l 66134 23.4%
i 43065 15.2%
t 37194 13.1%
n 35960 12.7%
S 28796 10.2%
o 13591 4.8%
U 8249 2.9%
k 8249 2.9%
w 8249 2.9%
g 5727 2.0%
Other values (10) 27959 9.9%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 229429 81.0%
Uppercase Letter 53744 19.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
l 66134 28.8%
i 43065 18.8%
t 37194 16.2%
n 35960 15.7%
o 13591 5.9%
k 8249 3.6%
w 8249 3.6%
g 5727 2.5%
e 5423 2.4%
c 3022 1.3%
Other values (2) 2815 1.2%
Uppercase Letter
Value Count Frequency (%)
S 28796 53.6%
U 8249 15.3%
T 5727 10.7%
O 2878 5.4%
F 2671 5.0%
I 2608 4.9%
V 2608 4.9%
B 207 0.4%

Most occurring scripts

Value Count Frequency (%)
Latin 283173 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
l 66134 23.4%
i 43065 15.2%
t 37194 13.1%
n 35960 12.7%
S 28796 10.2%
o 13591 4.8%
U 8249 2.9%
k 8249 2.9%
w 8249 2.9%
g 5727 2.0%
Other values (10) 27959 9.9%

Most occurring blocks

Value Count Frequency (%)
ASCII 283173 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
l 66134 23.4%
i 43065 15.2%
t 37194 13.1%
n 35960 12.7%
S 28796 10.2%
o 13591 4.8%
U 8249 2.9%
k 8249 2.9%
w 8249 2.9%
g 5727 2.0%
Other values (10) 27959 9.9%

accuracy
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The highest accuracy for possible activities

Distinct 77
Distinct (%) 0.2%
Missing 96200
Missing (%) 66.6%
Infinite 0
Infinite (%) 0.0%
Mean 84.56430022
Minimum 24
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 24
5-th percentile 40
Q1 70
median 99
Q3 100
95-th percentile 100
Maximum 100
Range 76
Interquartile range (IQR) 30

Descriptive statistics

Standard deviation 23.51392516
Coefficient of variation (CV) 0.2780597143
Kurtosis -0.4316664299
Mean 84.56430022
Median Absolute Deviation (MAD) 1
Skewness -1.166070129
Sum 4080904
Variance 552.9046766
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
100 20385 14.1%
40 6346 4.4%
99 5374 3.7%
96 2464 1.7%
97 2123 1.5%
98 1593 1.1%
92 561 0.4%
50 414 0.3%
41 386 0.3%
85 323 0.2%
Other values (67) 8289 5.7%
(Missing) 96200 66.6%

Minimum 10 values

Value Count Frequency (%)
24 2 < 0.1%
25 3 < 0.1%
26 9 < 0.1%
27 13 < 0.1%
28 14 < 0.1%
29 26 < 0.1%
30 25 < 0.1%
31 51 < 0.1%
32 44 < 0.1%
33 62 < 0.1%

Maximum 10 values

Value Count Frequency (%)
100 20385 14.1%
99 5374 3.7%
98 1593 1.1%
97 2123 1.5%
96 2464 1.7%
95 250 0.2%
94 250 0.2%
93 234 0.2%
92 561 0.4%
91 178 0.1%

still
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "still" activity

Distinct 101
Distinct (%) 0.2%
Missing 96904
Missing (%) 67.1%
Infinite 0
Infinite (%) 0.0%
Mean 67.43853304
Minimum 0
Maximum 100
Zeros 2
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 3
Q1 10
median 97
Q3 100
95-th percentile 100
Maximum 100
Range 100
Interquartile range (IQR) 90

Descriptive statistics

Standard deviation 40.14394767
Coefficient of variation (CV) 0.5952672139
Kurtosis -1.425188787
Mean 67.43853304
Median Absolute Deviation (MAD) 3
Skewness -0.6332223744
Sum 3206972
Variance 1611.536535
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
100 15972 11.1%
10 9402 6.5%
99 6025 4.2%
98 1601 1.1%
96 1586 1.1%
97 1518 1.1%
1 1438 1.0%
2 719 0.5%
92 321 0.2%
50 283 0.2%
Other values (91) 8689 6.0%
(Missing) 96904 67.1%

Minimum 10 values

Value Count Frequency (%)
0 2 < 0.1%
1 1438 1.0%
2 719 0.5%
3 266 0.2%
4 281 0.2%
5 131 0.1%
6 142 0.1%
7 78 0.1%
8 170 0.1%
9 47 < 0.1%

Maximum 10 values

Value Count Frequency (%)
100 15972 11.1%
99 6025 4.2%
98 1601 1.1%
97 1518 1.1%
96 1586 1.1%
95 82 0.1%
94 70 < 0.1%
93 81 0.1%
92 321 0.2%
91 66 < 0.1%

on_foot
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "on_foot" activity

Distinct 98
Distinct (%) 0.3%
Missing 105707
Missing (%) 73.2%
Infinite 0
Infinite (%) 0.0%
Mean 21.00807721
Minimum 1
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 8
median 10
Q3 14
95-th percentile 96
Maximum 100
Range 99
Interquartile range (IQR) 6

Descriptive statistics

Standard deviation 29.28963286
Coefficient of variation (CV) 1.394208169
Kurtosis 2.211614416
Mean 21.00807721
Median Absolute Deviation (MAD) 3
Skewness 1.967311211
Sum 814084
Variance 857.8825928
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10 16652 11.5%
1 4254 2.9%
2 1790 1.2%
96 979 0.7%
4 920 0.6%
8 808 0.6%
3 739 0.5%
100 698 0.5%
97 694 0.5%
11 675 0.5%
Other values (88) 10542 7.3%
(Missing) 105707 73.2%

Minimum 10 values

Value Count Frequency (%)
1 4254 2.9%
2 1790 1.2%
3 739 0.5%
4 920 0.6%
5 609 0.4%
6 654 0.5%
7 530 0.4%
8 808 0.6%
9 504 0.3%
10 16652 11.5%

Maximum 10 values

Value Count Frequency (%)
100 698 0.5%
99 46 < 0.1%
98 344 0.2%
97 694 0.5%
96 979 0.7%
95 213 0.1%
94 202 0.1%
93 204 0.1%
92 300 0.2%
91 168 0.1%

walking
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "walking" activity

Distinct 94
Distinct (%) 0.2%
Missing 105989
Missing (%) 73.4%
Infinite 0
Infinite (%) 0.0%
Mean 21.01910629
Minimum 0
Maximum 100
Zeros 75
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 8
median 10
Q3 14
95-th percentile 96
Maximum 100
Range 100
Interquartile range (IQR) 6

Descriptive statistics

Standard deviation 29.23189464
Coefficient of variation (CV) 1.390729664
Kurtosis 2.229808091
Mean 21.01910629
Median Absolute Deviation (MAD) 3
Skewness 1.971230936
Sum 808584
Variance 854.5036643
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10 16748 11.6%
1 4285 3.0%
2 1579 1.1%
96 983 0.7%
4 804 0.6%
8 752 0.5%
3 721 0.5%
97 693 0.5%
11 678 0.5%
100 670 0.5%
Other values (84) 10556 7.3%
(Missing) 105989 73.4%

Minimum 10 values

Value Count Frequency (%)
0 75 0.1%
1 4285 3.0%
2 1579 1.1%
3 721 0.5%
4 804 0.6%
5 608 0.4%
6 604 0.4%
7 533 0.4%
8 752 0.5%
9 507 0.4%

Maximum 10 values

Value Count Frequency (%)
100 670 0.5%
99 41 < 0.1%
98 342 0.2%
97 693 0.5%
96 983 0.7%
95 211 0.1%
94 202 0.1%
93 193 0.1%
92 301 0.2%
91 167 0.1%

running
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "running" activity

Distinct 72
Distinct (%) 0.2%
Missing 113078
Missing (%) 78.3%
Infinite 0
Infinite (%) 0.0%
Mean 8.444741874
Minimum 0
Maximum 100
Zeros 143
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 10
median 10
Q3 10
95-th percentile 10
Maximum 100
Range 100
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 4.979313408
Coefficient of variation (CV) 0.5896347671
Kurtosis 111.5047655
Mean 8.444741874
Median Absolute Deviation (MAD) 0
Skewness 6.884520519
Sum 264996
Variance 24.79356201
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10 24141 16.7%
1 3678 2.5%
2 1683 1.2%
3 467 0.3%
4 447 0.3%
8 237 0.2%
6 165 0.1%
0 143 0.1%
5 92 0.1%
7 48 < 0.1%
Other values (62) 279 0.2%
(Missing) 113078 78.3%

Minimum 10 values

Value Count Frequency (%)
0 143 0.1%
1 3678 2.5%
2 1683 1.2%
3 467 0.3%
4 447 0.3%
5 92 0.1%
6 165 0.1%
7 48 < 0.1%
8 237 0.2%
9 21 < 0.1%

Maximum 10 values

Value Count Frequency (%)
100 2 < 0.1%
99 6 < 0.1%
98 1 < 0.1%
97 6 < 0.1%
96 2 < 0.1%
95 2 < 0.1%
93 1 < 0.1%
92 3 < 0.1%
90 2 < 0.1%
86 1 < 0.1%

in_vehicle
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "in_vehicle" activity

Distinct 93
Distinct (%) 0.2%
Missing 104169
Missing (%) 72.1%
Infinite 0
Infinite (%) 0.0%
Mean 19.20030281
Minimum 1
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 7
median 10
Q3 17
95-th percentile 96
Maximum 100
Range 99
Interquartile range (IQR) 10

Descriptive statistics

Standard deviation 26.23587386
Coefficient of variation (CV) 1.366430213
Kurtosis 3.575959102
Mean 19.20030281
Median Absolute Deviation (MAD) 5
Skewness 2.215164537
Sum 773561
Variance 688.3210772
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10 15155 10.5%
1 5706 3.9%
2 1887 1.3%
8 1182 0.8%
96 1057 0.7%
23 838 0.6%
97 748 0.5%
4 725 0.5%
15 696 0.5%
3 668 0.5%
Other values (83) 11627 8.0%
(Missing) 104169 72.1%

Minimum 10 values

Value Count Frequency (%)
1 5706 3.9%
2 1887 1.3%
3 668 0.5%
4 725 0.5%
5 428 0.3%
6 555 0.4%
7 352 0.2%
8 1182 0.8%
9 362 0.3%
10 15155 10.5%

Maximum 10 values

Value Count Frequency (%)
100 84 0.1%
99 86 0.1%
98 190 0.1%
97 748 0.5%
96 1057 0.7%
95 144 0.1%
94 195 0.1%
93 158 0.1%
92 198 0.1%
91 109 0.1%

on_bicycle
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "on_bicycle" activity

Distinct 90
Distinct (%) 0.2%
Missing 107690
Missing (%) 74.5%
Infinite 0
Infinite (%) 0.0%
Mean 8.700364447
Minimum 0
Maximum 100
Zeros 68
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 3
median 10
Q3 10
95-th percentile 15
Maximum 100
Range 100
Interquartile range (IQR) 7

Descriptive statistics

Standard deviation 10.37516392
Coefficient of variation (CV) 1.192497622
Kurtosis 47.51706577
Mean 8.700364447
Median Absolute Deviation (MAD) 0
Skewness 6.244231838
Sum 319895
Variance 107.6440263
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
10 18745 13.0%
1 4382 3.0%
2 3496 2.4%
3 1922 1.3%
4 1648 1.1%
6 988 0.7%
8 986 0.7%
5 940 0.7%
7 474 0.3%
12 334 0.2%
Other values (80) 2853 2.0%
(Missing) 107690 74.5%

Minimum 10 values

Value Count Frequency (%)
0 68 < 0.1%
1 4382 3.0%
2 3496 2.4%
3 1922 1.3%
4 1648 1.1%
5 940 0.7%
6 988 0.7%
7 474 0.3%
8 986 0.7%
9 280 0.2%

Maximum 10 values

Value Count Frequency (%)
100 53 < 0.1%
99 53 < 0.1%
98 26 < 0.1%
97 75 0.1%
96 28 < 0.1%
95 8 < 0.1%
94 6 < 0.1%
93 8 < 0.1%
92 8 < 0.1%
91 9 < 0.1%

tilting
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

The value of the "tilting" activity

Distinct 1
Distinct (%) < 0.1%
Missing 120881
Missing (%) 83.7%
Memory size 1.1 MiB
100.0 23577

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 117885
Distinct characters 3
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 100.0
2nd row 100.0
3rd row 100.0
4th row 100.0
5th row 100.0

Common Values

Value Count Frequency (%)
100.0 23577 16.3%
(Missing) 120881 83.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
100.0 23577 100.0%

Most occurring characters

Value Count Frequency (%)
0 70731 60.0%
1 23577 20.0%
. 23577 20.0%

Most occurring categories

Value Count Frequency (%)
Decimal Number 94308 80.0%
Other Punctuation 23577 20.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 70731 75.0%
1 23577 25.0%
Other Punctuation
Value Count Frequency (%)
. 23577 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 117885 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 70731 60.0%
1 23577 20.0%
. 23577 20.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 117885 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 70731 60.0%
1 23577 20.0%
. 23577 20.0%

unknown
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

The value of the "unknown" activity

Distinct 87
Distinct (%) 0.2%
Missing 101803
Missing (%) 70.5%
Infinite 0
Infinite (%) 0.0%
Mean 16.16113
Minimum 0
Maximum 100
Zeros 209
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 1
Q1 1
median 2
Q3 40
95-th percentile 44
Maximum 100
Range 100
Interquartile range (IQR) 39

Descriptive statistics

Standard deviation 20.41814927
Coefficient of variation (CV) 1.263410991
Kurtosis -0.2534925914
Mean 16.16113
Median Absolute Deviation (MAD) 1
Skewness 0.950896691
Sum 689353
Variance 416.9008195
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
1 15165 10.5%
40 11364 7.9%
2 9436 6.5%
3 1957 1.4%
8 480 0.3%
41 379 0.3%
50 248 0.2%
15 227 0.2%
0 209 0.1%
31 192 0.1%
Other values (77) 2998 2.1%
(Missing) 101803 70.5%

Minimum 10 values

Value Count Frequency (%)
0 209 0.1%
1 15165 10.5%
2 9436 6.5%
3 1957 1.4%
4 68 < 0.1%
6 16 < 0.1%
8 480 0.3%
9 2 < 0.1%
10 12 < 0.1%
11 2 < 0.1%

Maximum 10 values

Value Count Frequency (%)
100 17 < 0.1%
98 11 < 0.1%
96 7 < 0.1%
94 12 < 0.1%
93 4 < 0.1%
92 19 < 0.1%
91 9 < 0.1%
90 6 < 0.1%
89 16 < 0.1%
88 4 < 0.1%

Interactions

[Figures: pairwise variable interaction plots]
2022-07-04T19:13:30.383263 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:32.925534 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:35.222912 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:37.721869 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:39.943101 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:42.240401 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:44.707047 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:47.225827 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:25.924307 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:28.241055 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:30.625179 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:33.149739 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:35.448701 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:37.940833 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:40.165457 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:42.465624 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:44.929564 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:47.453157 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:26.156356 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:28.471654 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:31.075392 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:33.375528 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:35.682832 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:38.171263 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:40.391145 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:42.694765 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/
2022-07-04T19:13:45.155213 image/svg+xml Matplotlib v3.5.2, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient ( ρ ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y , one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
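
The calculation described above can be sketched in plain Python. This is an illustrative sketch only, not the code the report generator runs; the helper names `average_ranks` and `spearman_rho` are made up for this example, and tied values are assumed to share the mean of their ranks.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / sqrt(var_x * var_y)
```

Because only ranks enter the formula, a strictly increasing but nonlinear relationship such as y = x³ still yields ρ = 1.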

Pearson's r

Pearson's correlation coefficient ( r ) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r ; only the sign of the slope does.

To calculate r for two variables X and Y , one divides the covariance of X and Y by the product of their standard deviations.
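
As a rough illustration of this formula, the following standard-library sketch computes r directly from the definition. The function name `pearson_r` is made up for this example, and returning NaN for a constant input is a choice of this sketch, not necessarily what the report generator does.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the two standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / sqrt(var_x * var_y)
```

The zero-variance branch matters for this dataset: a column flagged as constant, such as tilting, has a standard deviation of zero, so r is undefined for any pair that includes it.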

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient ( τ ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y , one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
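
The pair-counting rule above corresponds to the variant usually called tau-a. A minimal sketch follows, with the function name `kendall_tau_a` made up for this example. Library implementations often use tie-adjusted variants such as tau-b, so their values can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs
```

For example, with x = [1, 2, 3, 4] and y = [1, 3, 2, 4] there are six pairs, five concordant and one discordant, giving τ = (5 - 1) / 6.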

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
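
For orientation, here is a sketch of the bias-corrected estimator as it is commonly stated: the squared phi coefficient and the table dimensions are each adjusted by a term that shrinks with the sample size n. The function name `cramers_v_corrected` is made up for this example, and no continuity correction is applied to the chi-squared statistic; the report generator's own implementation may differ in such details.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("a and b must have the same length, at least 2")
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias corrections: subtract the expected value of phi2 under independence
    # and shrink the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return float("nan")  # undefined when a variable has a single category
    return sqrt(phi2_corr / denom)
```

With the correction, two independent variables give a value of exactly 0 instead of a small positive number, while a perfect one-to-one association still gives 1.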

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
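
One way to read the heatmap is as the Pearson correlation between the missing-value indicators of two columns. The sketch below illustrates that idea under the assumption that missing cells are represented by `None`; the function name `nullity_correlation` and the sample values are made up for this example, and the plotting library's own computation may differ in detail.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns."""
    a = [1.0 if value is None else 0.0 for value in col_a]
    b = [1.0 if value is None else 0.0 for value in col_b]
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("columns must have the same length, at least 2")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no nullity variation.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Made-up values: both columns are missing in exactly the same rows.
still = [90.0, None, 80.0, None]
walking = [10.0, None, 20.0, None]
same_pattern = nullity_correlation(still, walking)
```

A value of 1 means the two columns are always missing together, -1 means one is present exactly when the other is missing, and values near 0 mean the missingness of one says little about the other.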
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.